Summary. - Consider the day when an invariant mass peak, roughly compatible 
with "the Higgs", begins to emerge, say at the LHC, ... and may you see that day. 
There will be a difference between discovery and scrutiny. The latter would involve 
an effort to ascertain what it is, or is not, that has been found. It turns out that 
the two concepts are linked: Scrutiny will naturally result in deeper knowledge - is 
*this* what you were all looking for? - but may also speed up discovery. 

PACS 14.80.Bn - 14.80.Cp.

1. - Introduction

Let the single missing scalar of the Standard Model (SM) be called "the Higgs", to
stick to a debatable misdeed. Because the idea is so venerable, one may have grown 
insensitive to how special a Higgs boson would be. Its quantum numbers must be those 
of the vacuum, which its field permeates. The boson itself would be the vibrational 
quantum *of* the vacuum, not a mere quantum *in* the vacuum, or in some other 
substance. The couplings of the Higgs to quarks and leptons are proportional to their 
masses. So are its couplings to $W^\pm$ and Z, a fact that, within the SM, is in a sense
verified. A significantly precise direct measurement of the Higgs couplings to fermions is 
not an easy task. Even for the heaviest of them, the top quark, the required integrated 
luminosity is large, as illustrated by the ATLAS collaboration on the left of Fig. 1. 

In the past, given a newly discovered particle, one had to figure out its $J^{PC}$ quantum
numbers (or its disrespect of the super-indexed ones) to have it appear in the Particle 
Data Book. Publication in the New York Times was not considered that urgent, nor was 
it immediate for bad news. Times have changed. Yet, two groups [1, 2] have thoroughly 
studied the determination of the quantum numbers and coupling characteristics of a 
putative signal at the LHC, that could be the elementary scalar of the SM, or an impostor 
thereof, both dubbed H here. The "golden channel" for this exercise is $H \to (ZZ$ or
$ZZ^*) \to \ell_1^+ \ell_1^- \ell_2^+ \ell_2^-$, where $\ell_{1,2}$ is an e or a $\mu$, and $Z^*$ denotes that,
for $M_H < 2M_Z$, one of the Zs is "off-shell". For a review of previous work on the
subject, see e.g. [3].

To be realistic (?) let me consider two competing teams. They are working at a pp
collider of energy $\sqrt{s} = 10$ TeV, luminosity 10^^ cm$^{-2}$ s$^{-1}$ and Snowmass factor of 3 (on
average, things work well 1/3 of the time). The SM is correct, $M_H = 200$ GeV and the
estimates of signals and backgrounds are reliable. As the number of events increases,
Team 2 would gather evidence for an $M_{ZZ}$ peak at the rate shown on the right of
Fig. 1. Team 1 is additionally checking that, indeed, the object has $J^{PC} = 0^{++}$. T1
reaches "discovery" ($5\sigma$ significance) some three months before T2. The horizontal error
bars, dominated by fluctuations in the expected background, tell us that the two teams
are *only* $1\sigma$ apart (iff from two different experiments!). But that means the probability
of T1 (from experiment A) being 3 months ahead of T2 (from experiment B $\neq$ A) is
~ 66% (~ 100% for B = A). The odds for winning with dice, if your competitor lets you
win for 4 out of the 6 faces, are also 66%. If the stakes are this high, would you not play? It
is interesting to compare the two $J^{P}$-identity-revealing integrated luminosities in Fig. 1,
more so since the event numbers on its right refer to the chain $H \to ZZ \to e^+e^-\mu^+\mu^-$ and
are approximately quadrupled when all $4\ell$ channels are considered.

Fig. 1. - Left: Fractional precision on the measurable ratios of branching ratios for SM H decays
into W, Z, t and $\tau$ pairs as functions of $M_H$. Right: An example of a discovery and scrutiny plot
of a SM scalar with $M_H = 200$ GeV, not specially chosen for effect. $\vec X$ as in Fig. 2.

Standard signal and background cross sections times branching ratios, $\sigma \times B$, were
used in Fig. 1. In discussing H impostors we accept that they should not be distinguished
from a SM H on $\sigma \times B$ grounds, which, for all impostors, are hugely model-dependent.



2. - Methodology 

The technique to be used to measure $J^{PC}$ for a putative H signal has some pedigree.
Its quantum-mechanical version (called nowadays the "matrix element" method)
capitalizes on the entanglement of the two Z polarizations and dates back at least to the
first (correct) measurements of the correlated $\gamma$ polarizations in parapositronium ($0^{-+}$)
decay [4]. The technique is even older, as it actually consists in comparing theory and 
observations. The art is in exploiting a maximum of the information from both sides. 

The event-by-event information on the channel at hand is very large; some of it is
illustrated in Fig. 2, for the decay chain $H \to ZZ \to e^+e^-\mu^+\mu^-$, with H brought to rest.
The angular variables $\vec\Omega$ describe Z-pair production relative to the annihilating $gg$ or $q\bar q$
pair. The variables $\vec\omega$ are the Z-pair decay angles. For fixed $\vec\Omega$, $\vec\omega$, and $M^*$ (the mass of
a lepton pair if its parent Z is off-shell) that is all there is: none less than six beautifully
entangled variables ($M[4\ell]$ is also measured event by event; $M_H$ is traditionally extracted
from a fit to the $M[4\ell]$ distribution).

Fig. 2. - The angles of ZZ pair-production, $\vec\Omega = \{\cos\Theta, \Phi = \varphi_2\}$, and leptonic decay,
$\vec\omega = \{\cos\theta_1, \cos\theta_2, \varphi = \varphi_2 - \varphi_1\}$. $\vec X = \{\vec\Omega, \vec\omega\}$.

Real detectors have limited coverage in angles and momenta; they "mis-shape" the
theoretical distributions in the quantities just described. An example for a realistic
detector and an unrealistic flat expectation is illustrated on the right of Fig. 3. For an
H with J = 0, the distribution in $\vec\Omega$ is flat, so that its inclusion (in this case) would
seem like an overkill. Not so! Detector-shaping effects and the correlations between the
angular variables conspire to make the use of the full machinery a necessity [2].

Fig. 3. - Detector-shaping effects at $M_H = 145$ GeV, for all relevant angles and $M^*$. The trigger
and energy thresholds, resolutions and angular coverage are those of a "typical" detector.

Fig. 4. - Left: A signal on an $M(ZZ)$ distribution. Middle: sPlot of the $\cos\theta$ distribution of
the "signal" events, compared with the Monte Carlo truth and the (detector-shaped) expected
distribution, for $J^{PC} = 0^{++}$. Right: Same as Middle, for the "background" events.



There is a wonderful "s-Weighing" method for (much of) the exercise of ascertaining
the LHC's potential to select the preferred hypothesis for an observed H candidate.
Consider an $M[4\ell]$ distribution with an H peak at 250 GeV, constructed with the standard
expectations for signal and background, as in Fig. 4. Performing a maximum-likelihood
fit to this distribution one can ascertain the probability of events in each $M[4\ell]$ bin to be
signal or background. Next one can astutely (and even statistically optimally) reweigh
the events into "signal" and "background" categories, to study their distributions in other
variables [5], such as $\cos\theta = \cos\theta_1$ or $\cos\theta_2$ in Fig. 4. In this pseudo-experiment one
knows the "Monte Carlo truth", compared in the figure with the impressive s-outcomes
and the detector-shaped expectation. We use the full (correlated) distributions in all
mentioned variables, but $M_H$, to confront "data" with different hypotheses.
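
The reweighing step can be sketched in a few lines. What follows is a minimal toy, not the analysis of [2] or [5]: the Gaussian peak at 250 GeV with 10 GeV width, the flat background on 150-350 GeV, the event counts and the $1 + \cos^2\theta$ signal shape are all assumptions made for illustration, and the names `fit_yields` and `signal_sweights` are hypothetical. The yields are fit by extended maximum likelihood with fixed shapes, and the signal weights are then built from the covariance matrix of the fitted yields.

```python
import numpy as np

rng = np.random.default_rng(7)
N_SIG_TRUE, N_BKG_TRUE = 300, 700          # assumed toy event counts
M_LO, M_HI, M_PEAK, M_WIDTH = 150.0, 350.0, 250.0, 10.0  # assumed, in GeV


def sample_one_plus_cos2(rng, size):
    """Accept-reject sampling of c in [-1, 1] from a density ~ 1 + c^2."""
    out = np.empty(0)
    while out.size < size:
        c = rng.uniform(-1.0, 1.0, 2 * size)
        keep = rng.uniform(0.0, 2.0, 2 * size) < 1.0 + c**2
        out = np.concatenate([out, c[keep]])
    return out[:size]


def fit_yields(f_sig, f_bkg, n_iter=500):
    """Extended ML fit of the two yields, shapes fixed (EM iteration)."""
    n_s = n_b = 0.5 * len(f_sig)
    for _ in range(n_iter):
        denom = n_s * f_sig + n_b * f_bkg
        n_s, n_b = np.sum(n_s * f_sig / denom), np.sum(n_b * f_bkg / denom)
    return n_s, n_b


def signal_sweights(f_sig, f_bkg, n_s, n_b):
    """Per-event signal weights from the covariance matrix of the yields."""
    denom = n_s * f_sig + n_b * f_bkg
    f = np.vstack([f_sig, f_bkg])
    v_inv = (f[:, None, :] * f[None, :, :] / denom**2).sum(axis=2)
    v = np.linalg.inv(v_inv)
    return (v[0, 0] * f_sig + v[0, 1] * f_bkg) / denom


# Discriminating variable: the four-lepton mass (signal events come first).
mass = np.concatenate([rng.normal(M_PEAK, M_WIDTH, N_SIG_TRUE),
                       rng.uniform(M_LO, M_HI, N_BKG_TRUE)])
# Control variable, generated independently of the mass.
cos_theta = np.concatenate([sample_one_plus_cos2(rng, N_SIG_TRUE),
                            rng.uniform(-1.0, 1.0, N_BKG_TRUE)])

f_sig = (np.exp(-0.5 * ((mass - M_PEAK) / M_WIDTH) ** 2)
         / (M_WIDTH * np.sqrt(2.0 * np.pi)))
f_bkg = np.full_like(mass, 1.0 / (M_HI - M_LO))

n_sig_fit, n_bkg_fit = fit_yields(f_sig, f_bkg)
w_sig = signal_sweights(f_sig, f_bkg, n_sig_fit, n_bkg_fit)

# Weighted histogram of the control variable versus the Monte Carlo truth.
hist_sw, edges = np.histogram(cos_theta, bins=5, range=(-1, 1), weights=w_sig)
hist_truth, _ = np.histogram(cos_theta[:N_SIG_TRUE], bins=5, range=(-1, 1))
print("fitted signal yield:", round(float(n_sig_fit), 1))
print("s-weighted:", np.round(hist_sw, 1))
print("truth     :", hist_truth)
```

At the fitted yields the weights sum to the signal yield, which is a quick sanity check on any implementation. The weighted histogram is unbiased only if the control variable is independent of the mass within each category, which this toy enforces by construction.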

The astute reader has noticed that I have not mentioned the $\eta$ and $p_T$ distributions of
the ZZ or ZZ* pair (be it an H signal or the irreducible background). Event by event,
one can undo the corresponding boost but, to ascertain the detector-shaping effects, as
in Fig. 3, for all the various SM or impostor H objects, one has to use a specific event
generator. We have done it [2], but we chose to "pessimize" our results in this respect,
not exploiting the ($\eta$, $p_T$) distributions as part of the theoretical expectations (which
for impostors would be quite model-dependent). One reason is that the relevant parton
distribution functions (PDFs) will be better known by the time a Higgs hunt becomes
realistic. Another is that one can use the s-Weigh technique to extract and separately
plot the ($\eta$, $p_T$) distribution for signal and background. The production of a SM H -
but not that of most conceivable impostors - is dominated by an extremely theory-laden
process: gluon fusion via a top loop. As a first step it is preferable *to see* whether or
not the ($\eta$, $p_T$) distribution of the s-sieved signal events is that expected for $gg$ fusion,
as opposed to $q\bar q$ annihilation (1). The answer would be fascinating.



(1) The only impact of the difference between the two production processes is on the detector-
shaping effects. But these are not large enough for the ensuing differences to affect our results.

3. - Theory

The most general Lorentz-invariant couplings of J = 0, 1 particles to the polarization
vectors $\epsilon_1$ and $\epsilon_2$ of two Zs of four-momenta $p_1$ and $p_2$ are given by the expressions:

$$X_0\, g_{\mu\alpha} + (P_0 + i Q_0)\, \epsilon_{\mu\alpha\sigma\tau}\, p_1^\sigma p_2^\tau / M_Z^2 - (Y_0 + i Z_0)\, (p_1 - p_2)_\alpha (p_1 - p_2)_\mu / M_Z^2,$$



The vertex for J = 2 is cumbersome. The quantities $X_i$, $P_i$, ... can be taken to be real,
but for small absorptive effects. The expressions can be used to derive the distribution
functions $pdf(J^{PC}; M^*, \cos\Theta, \Phi, \cos\theta_1, \cos\theta_2, \varphi)$ allowing one to determine the spin of
an H and the properties of the HZZ coupling. To give some J = 0 examples: in the SM
only $X_0 = g M_Z/\cos\theta_W$ is nonvanishing. For $J = 0^-$ only $Q_0 \neq 0$. If $X_0$ and $Q_0$ (or
$P_0$) $\neq 0$, the HZZ vertex violates P (or CP). For a "composite scalar" $X_0, Y_0 \neq 0$.

4. - Some results

While Team 1 members are trying to establish the significance of the discovery of an
object of specified properties (as in Fig. 1, right), they may, with a few extra lines of
code, be extracting much more information from the same data set, by asking leading
questions (LQs), NLQs, NNLQs..., whose answers are decreasingly statistically significant.

Fig. 5. - Expected confidence levels, as functions of the number of events, to reject the wrong
hypothesis ($H_0$, the SM in this case) in favour of the right one ($H_1$). Left and Right: $H_1$ is $1^-$,
for $M_H = 145$ and 350 GeV. Middle: $H_1$ is $1^+$, $M_H = 200$ GeV.



The quintessential LQ is which of two hypotheses describes the data best, assuming
that one of them is right. If the hypotheses are "simple" (contain no parameters to be
fit) the Neyman-Pearson lemma guarantees that the likelihood-ratio test is the most
powerful one. Three examples are given in Fig. 5. On its left and right it is seen that it is
"easy" (it takes a few tens of events) to rule out the SM, if the observed resonance is an
$M_H = 145$ or 350 GeV vector. On its middle, we see that, if the object is an axial vector, it
would be much harder. This is not due to the differing $J^P$, but to the choice $M_H = 200$
GeV. For masses close to the $H \to ZZ$ threshold, the lever arm provided by the lepton
three-momenta is short, and the differences between pdfs are diminished. In fact, as an
answer to a NLQ, we have shown that, except close to threshold, it is "easy" to tell any
J = 0 from any J = 1 object, no matter how general their HZZ couplings are [2]. In
Fig. 6 we see that it is easy, if the SM is right, to exclude $J = 2^+$ at $M_H = 350$ GeV,
but not at 200. We also see that the interchange of right and wrong hypotheses leads to
very similar expectations.

Fig. 6. - Analogous to Fig. 5, with the hypotheses $J^P = 2^+$ and $0^+$, once interchanged.
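
A test between two simple hypotheses can be illustrated with a one-variable toy. The sketch below is an assumption-laden stand-in, not the multi-variable pdfs of [2]: $H_0$ is taken to be a flat distribution in $x = \cos\theta$ and $H_1$ to be proportional to $1 + x^2$, both chosen only for illustration. It builds the log-likelihood ratio, fixes the threshold from $H_0$ pseudo-experiments at a 5% false-rejection rate, and estimates how often $H_0$ is rejected when $H_1$ is true.

```python
import numpy as np


def sample_h1(rng, size):
    """Accept-reject sampling of x in [-1, 1] from p1(x) = 3/8 (1 + x^2)."""
    out = np.empty(0)
    while out.size < size:
        x = rng.uniform(-1.0, 1.0, 2 * size)
        keep = rng.uniform(0.0, 2.0, 2 * size) < 1.0 + x**2
        out = np.concatenate([out, x[keep]])
    return out[:size]


def log_likelihood_ratio(x):
    """Sum over the events (last axis) of ln[p1(x)/p0(x)], with p0 = 1/2."""
    return np.sum(np.log(0.75 * (1.0 + x**2)), axis=-1)


def power_at_5_percent(rng, n_events, n_toys=4000):
    """Fraction of H1 toys rejecting H0 at a 5% false-rejection rate."""
    x0 = rng.uniform(-1.0, 1.0, (n_toys, n_events))            # H0 is true
    x1 = sample_h1(rng, n_toys * n_events).reshape(n_toys, n_events)
    llr0 = log_likelihood_ratio(x0)
    llr1 = log_likelihood_ratio(x1)
    threshold = np.quantile(llr0, 0.95)
    return np.mean(llr1 > threshold), llr0.mean(), llr1.mean()


rng = np.random.default_rng(1)
power_20, mean0_20, mean1_20 = power_at_5_percent(rng, 20)
power_200, mean0_200, mean1_200 = power_at_5_percent(rng, 200)
print("power with  20 events:", power_20)
print("power with 200 events:", power_200)
```

In this toy the two shapes are deliberately similar, so many events are needed. The pdfs of the text differ in several correlated angles at once, which is why a few tens of events can suffice there.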
Fig. 7. - Left: Various choices of likelihood functions, employing different sub-optimal sets of
variables in their pdfs, are compared with the choice containing all angular variables and their
correlations (topmost curve). Right: True and measured values of the mixing angle describing
a composite scalar, for $M_H = 145$ GeV.



On the right of Fig. 7 is the answer to a NNLQ. We have assumed that a composite
$J^{PC} = 0^{++}$ Higgs has been found and parametrized its ZZ coupling by an angle $\xi_{XY} =
\arctan(Y_0/X_0)$. The measured value of $\xi_{XY}$ is seen to be the input one, but for 50 events
the uncertainties on what the input was, to be read horizontally, are large. For this
case of a specific $J^{PC}$, but a complicated coupling, the various terms in the pdf are
not distinguishable on grounds of their properties under P and CP. They do strongly
interfere for specific values of $\xi_{XY}$, and the results of Fig. 7 are not easy to obtain,
requiring a full Feldman-Cousins belt construction [2].

Given a small data set constituting an initial discovery, one might settle for a
stripped-down analysis. The cost of such a sub-optimal choice is shown on the left of Fig. 7 for
$M_H = 200$ GeV, illustrating the discrimination between the $0^+$ and $1^-$ hypotheses for
likelihood definitions that exploit different sets of variables. N-dimensional pdfs in the
variables $\{a_1, \ldots, a_N\}$ are denoted $P(a_1, \ldots, a_N)$, while $\prod_i P(a_i)$ is constructed from
one-dimensional pdfs for all variables, ignoring (erroneously) their correlations.
$P(\vec\omega\,|\,\langle\vec\Omega\rangle_{TH})$ are pdfs including the variables $\vec\omega$ and their correlations, but with the
hypothesis $1^-$ represented by a pdf in which the variables $\vec\Omega$ have been integrated out. The
likelihood $P(\vec\omega\,|\,\langle\vec\Omega\rangle_{TH})$ performs badly even relative to $P(\vec\omega)$, which uses fewer angular
variables. The two differ only in that the first construction implicitly assumes a uniform $4\pi$
coverage of the observed leptons (an assumption customary in the literature) as if the muon
$p_T$ and $\eta$ analysis requirements did not depend on the $\vec\Omega$ angular variables.




Fig. 8. - The pdfs of the SM at $M_H = 200$ GeV, integrated in all variables but $\cos\theta_1$ and $\cos\theta_2$.
Left: the correct $P(\cos\theta_1, \cos\theta_2)$. Right: the "approximation" $P(\cos\theta_1) \times P(\cos\theta_2)$.



Treating the correlated angular variables as uncorrelated, as in the $\prod_i P(a_i)$ example
of Fig. 7, not only degrades the discrimination significance but would lead to
time-dependent, ultimately wrong conclusions. Assume, for example, the SM with $M_H = 200$
GeV. Let the data be fit to either a fully correlated pdf or an uncorrelated one. The
projections of the corresponding theoretical pdfs, involving only the variables $\cos\theta_1$ and
$\cos\theta_2$, are illustrated in Fig. 8. On the left (right) of the figure we see $P[\cos\theta_1, \cos\theta_2]$
($P[\cos\theta_1] \times P[\cos\theta_2]$). With limited statistics - insufficient to distinguish between the
correlated and uncorrelated distributions - the correct conclusion will be reached: the data
are compatible with the SM. But, as the statistics are increased, the data will significantly
deviate from the $P[\cos\theta_1] \times P[\cos\theta_2]$ distribution, and a false rejection of the SM
hypothesis would become increasingly supported.
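
The growth of the discrepancy with statistics can be made quantitative in a toy. The sketch below assumes a hypothetical correlated density $p(x, y) = (1 + a\,x\,y)/4$ on $[-1, 1]^2$ with $a = 0.5$, which is not the SM pdf of Fig. 8 but shares its key feature: the one-dimensional projections alone carry no trace of the correlation. The mean log-likelihood difference per event between the correlated pdf and the product of its marginals is their Kullback-Leibler divergence, so the expected total difference grows linearly with the number of events.

```python
import numpy as np

A = 0.5        # assumed correlation strength of the toy density
N_GRID = 400   # midpoint-rule grid points per axis

h = 2.0 / N_GRID
centres = -1.0 + (np.arange(N_GRID) + 0.5) * h
x, y = np.meshgrid(centres, centres, indexing="ij")

# Correlated toy density and the product of its one-dimensional marginals.
p_joint = (1.0 + A * x * y) / 4.0
p_x = p_joint.sum(axis=1) * h
p_y = p_joint.sum(axis=0) * h
p_product = np.outer(p_x, p_y)

# Mean of ln(p_joint / p_product) per event when p_joint is the truth.
kl_per_event = np.sum(p_joint * np.log(p_joint / p_product)) * h * h


def expected_delta_lnl(n_events):
    """Expected ln-likelihood gap between correlated and factorized fits."""
    return n_events * kl_per_event


for n in (50, 500, 5000):
    print(n, "events: expected delta lnL =", round(float(expected_delta_lnl(n)), 2))
```

With a few tens of events the expected gap is below one unit of log-likelihood, and the factorized pdf looks adequate. With thousands of events it is tens of units, and data drawn from the correlated truth would appear to reject the factorized "SM".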

The difference between $P[\cos\theta_1, \cos\theta_2]$ and $P[\cos\theta_1] \times P[\cos\theta_2]$ is precisely what an
unbelieving Einstein called spooky action at a distance. But, mercifully for physicists,
the Lord is subtle *and* perverse.

5. - Conclusions

I have alleged, by way of example, that for a fixed detector performance and integrated 
luminosity (and no extra Swiss Francs) it pays to have ab initio an analysis combining 
discovery and scrutiny. This is arguably true for many physics items other than $H \to 4\ell$.
They readily come to mind.